NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Primacy Effect of ChatGPT

https://doi.org/10.18653/v1/2023.emnlp-main.8

Wang, Yiwei; Cai, Yujun; Chen, Muhao; Liang, Yuxuan; Hooi, Bryan (January 2023, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing)

Instruction-tuned large language models (LLMs), such as ChatGPT, have led to promising zero-shot performance in discriminative natural language understanding (NLU) tasks. This involves querying the LLM using a prompt containing the question, and the candidate labels to choose from. The question-answering capabilities of ChatGPT arise from its pre-training on large amounts of human-written text, as well as its subsequent fine-tuning on human preferences, which motivates us to ask: Does ChatGPT also inherit humans’ cognitive biases? In this paper, we study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer. We have two main findings: i) ChatGPT’s decision is sensitive to the order of labels in the prompt; ii) ChatGPT has a clearly higher chance to select the labels at earlier positions as the answer. We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions. We release the source code at https://github.com/wangywUST/PrimacyEffectGPT.
more » « less
How Fragile is Relation Extraction under Entity Replacements?

https://doi.org/10.18653/v1/2023.conll-1.27

Wang, Yiwei; Hooi, Bryan; Wang, Fei; Cai, Yujun; Liang, Yuxuan; Zhou, Wenxuan; Tang, Jing; Duan, Manjuan; Chen, Muhao (January 2023, Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL))

Relation extraction (RE) aims to extract the relations between entity names from the textual context. In principle, textual context determines the ground-truth relation and the RE models should be able to correctly identify the relations reflected by the textual context. However, existing work has found that the RE models memorize the entity name patterns to make RE predictions while ignoring the textual context. This motivates us to raise the question: are RE models robust to the entity replacements? In this work, we operate the random and type-constrained entity replacements over the RE instances in TACRED and evaluate the state-of-the-art RE models under the entity replacements. We observe the 30% - 50% F1 score drops on the state-of-the-art RE models under entity replacements. These results suggest that we need more efforts to develop effective RE models robust to entity replacements. We release the source code at https://github.com/wangywUST/RobustRE.
more » « less
GraphCache: Message Passing as Caching for Sentence-Level Relation Extraction

https://doi.org/10.18653/v1/2022.findings-naacl.128

Wang, Yiwei; Chen, Muhao; Zhou, Wenxuan; Cai, Yujun; Liang, Yuxuan; Hooi, Bryan (January 2022, Findings of the Association for Computational Linguistics: NAACL 2022)

Entity types and textual context are essential properties for sentence-level relation extraction (RE). Existing work only encodes these properties within individual instances, which limits the performance of RE given the insufficient features in a single sentence. In contrast, we model these properties from the whole dataset and use the dataset-level information to enrich the semantics of every instance. We propose the GraphCache (Graph Neural Network as Caching) module, that propagates the features across sentences to learn better representations for RE. GraphCache aggregates the features from sentences in the whole dataset to learn global representations of properties, and use them to augment the local features within individual sentences. The global property features act as dataset-level prior knowledge for RE, and a complement to the sentence-level features. Inspired by the classical caching technique in computer systems, we develop GraphCache to update the property representations in an online manner. Overall, GraphCache yields significant effectiveness gains on RE and enables efficient message passing across all sentences in the dataset.
more » « less
Full Text Available
Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

https://doi.org/10.18653/v1/2022.naacl-main.224

Wang, Yiwei; Chen, Muhao; Zhou, Wenxuan; Cai, Yujun; Liang, Yuxuan; Liu, Dayiheng; Yang, Baosong; Liu, Juncheng; Hooi, Bryan (January 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Recent literature focuses on utilizing the entity information in the sentence-level relation extraction (RE), but this risks leaking superficial and spurious clues of relations. As a result, RE still suffers from unintended entity bias, i.e., the spurious correlation between entity mentions (names) and relations. Entity bias can mislead the RE models to extract the relations that do not exist in the text. To combat this issue, some previous work masks the entity mentions to prevent the RE models from over-fitting entity mentions. However, this strategy degrades the RE performance because it loses the semantic information of entities. In this paper, we propose the CoRE (Counterfactual Analysis based Relation Extraction) debiasing method that guides the RE models to focus on the main effects of textual context without losing the entity information. We first construct a causal graph for RE, which models the dependencies between variables in RE models. Then, we propose to conduct counterfactual analysis on our causal graph to distill and mitigate the entity bias, that captures the causal effects of specific entity mentions in each instance. Note that our CoRE method is model-agnostic to debias existing RE systems during inference without changing their training processes. Extensive experimental results demonstrate that our CoRE yields significant gains on both effectiveness and generalization for RE. The source code is provided at: https://github.com/vanoracai/CoRE.
more » « less
Full Text Available

Search for: All records